Synthetic Data Generation for Custom Policies

When you create a custom policy, DynamoGuard uses Synthetic Data Generation to produce a robust set of training data. This data covers several datapoint types, for both user inputs and model responses, so that it simulates realistic LLM usage and response scenarios.

Input Data Taxonomy

  1. In-Domain: Prompts that are relevant to the usage scenario and simulate common user inputs
  2. Borderline: Prompts that are more challenging to classify as compliant or non-compliant
  3. Diverse: Prompts, possibly out-of-domain, that simulate unexpected user inputs
  4. Jailbreak: Adversarial or tricky prompts that leverage Dynamo's 60+ jailbreaking techniques to challenge the policy model

Output Data Taxonomy

  1. Aligned: Model responses that are aligned to be compliant with the policy
  2. Jailbreak: Model responses that are aligned to be non-compliant with the policy
  3. Neutral: General model responses that are not specifically aligned to be compliant or non-compliant with the policy
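
To make the two taxonomies concrete, here is a minimal Python sketch of how a generated dataset could be tagged with these categories. It is an illustration only, not DynamoGuard's internal format or API: the names `InputCategory`, `OutputCategory`, `Datapoint`, and `category_counts` are hypothetical and chosen for this example.

```python
from collections import Counter
from dataclasses import dataclass
from enum import Enum
from typing import Iterable, Union


class InputCategory(Enum):
    """Prompt types from the Input Data Taxonomy (names are hypothetical)."""
    IN_DOMAIN = "in_domain"
    BORDERLINE = "borderline"
    DIVERSE = "diverse"
    JAILBREAK = "jailbreak"


class OutputCategory(Enum):
    """Response types from the Output Data Taxonomy (names are hypothetical)."""
    ALIGNED = "aligned"
    JAILBREAK = "jailbreak"
    NEUTRAL = "neutral"


@dataclass(frozen=True)
class Datapoint:
    """One synthetic training example tagged with its taxonomy category."""
    text: str
    category: Union[InputCategory, OutputCategory]

    @property
    def side(self) -> str:
        # "Jailbreak" exists in both taxonomies, so the side is derived
        # from the enum type rather than from the category name.
        return "input" if isinstance(self.category, InputCategory) else "output"


def category_counts(datapoints: Iterable[Datapoint]) -> Counter:
    """Count datapoints per (side, category) pair to inspect dataset coverage."""
    return Counter((d.side, d.category.value) for d in datapoints)
```

Keeping the input and output categories in separate enums keeps the two "Jailbreak" entries distinct: an adversarial prompt and a non-compliant response are different kinds of datapoint even though they share a name. Calling `category_counts` on a list of `Datapoint` objects gives a quick view of how evenly a dataset covers each type.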